1  Introduction

There  is  a  large  and  growing  body  of  algorithms  for  “visual  servoing”  (VS)  —  motion  control  using  visual  feedback. 
Traditionally,  VS  algorithms  generate  motor  reference  velocities  to  register  a  camera’s  current  view  of  a  scene  with 
a  previously  stored  view  (for  a  tutorial,  see  [7]). 

We  seek  to  move  VS  toward  a  systematic  theory  by  characterizing  the  geometry  of  “visible”  configurations  of 
a  visual  target  relative  to  a  camera.  In  particular,  for  a  specific  target  geometry  we  present  a  diffeomorphism  —  a 
smooth  and  smoothly  invertible  transformation  —  from  an  appropriately  defined  visible  set  of  configurations  to  an 
image  space.  We  believe  this  transformation  will  enable  the  construction  of  purely  image-based,  global  dynamic 
VS  algorithms. 

1.1  Background 

A  significant  challenge  involves  representing  rigid  motions  in  terms  of  visually  measured  quantities.  Ideally,  such 
a  representation  should  enable  effective  encoding  of 

•  Configuration and State, e.g. position and velocity or position and momentum for Lagrangian or Hamiltonian systems.

•  Tasks  and  goals,  e.g.  trajectories  in  the  state  space  or  points  in  the  configuration  space. 

•  Obstacles,  e.g.  the  edge  of  the  field-of-view  (FOV)  for  VS  systems. 

•  Uncertainty,  e.g.  sensor  and  actuator  noise  or  parametric  error. 

There  are  several  candidate  representations  of  image-based  rigid  motion  to  consider  from  the  literature.  The 
classical approach to “2D VS” employs the projection, treated as a vector in $\mathbb{R}^n$, of an arbitrary set of feature
points  [7].  The  redundancy  of  using  extra  feature  points  seems  to  confer  robustness  to  measurement  noise  in  any 
one  of  the  feature  measurements.  However,  the  movement  of  features  is  constrained  by  the  underlying  rigid  motion, 
rendering  image-based  control  and  motion  planning  in  image  space  challenging  for  large  deviations  from  a  goal. 
Notwithstanding  those  challenges,  Corke  and  Hutchinson  [2]  created  a  2D  kinematic  algorithm  for  6DOF  VS  that 
seems  (empirically)  to  have  a  very  large  basin  of  attraction  while  keeping  features  in  the  FOV.  Their  algorithm 
employs  a  clever  choice  of  image  features  which  helped  motivate  the  choice  of  features  used  in  this  paper. 

A  more  recent  approach  uses  partial  pose  reconstruction:  given  a  sufficient  number  of  feature  points,  the  relative 
pose,  up  to  a  scale  in  translation,  between  two  views  may  be  determined  without  exploiting  a  geometrical  model  of 
the  points.  Using  this  technique,  researchers  developed  six  DOF  VS  algorithms  robust  to  calibration  uncertainty 
[12,  15].  It  is  worth  noting  that  the  methods  used  require  sufficient  point  correspondences  between  views  to  fully 
reconstruct  a  geometric  model  of  the  visual  target  [10].  Application  of  this  method  to  contexts  besides  full  six 
DOF  VS  remains  a  challenge. 

Alternatively,  one  may  recover  the  complete  pose  of  a  camera  with  respect  to  a  target  by  exploiting  a  model  of 
the  target  [11].  Vision-based  controllers  using  full  pose  reconstruction  are  often  referred  to  as  “3D  VS”  algorithms. 
Model-based pose reconstruction requires fewer feature points than the model-free approach described above, and
has the added advantage of fully recovering feature depth, effectively reducing the camera to a “virtual Cartesian
sensor.” Representing visibility obstacles, such as the FOV or self-occlusions, is less parsimonious, but can be done
[5].  Formal  results  demonstrating  parametric  robustness  of  VS  systems  using  this  method  remain  elusive. 

Generalized image-based coordinates have proven extremely effective in a few narrow contexts [3, 5, 16]. Generalized coordinates describe kinematic motion with one variable per mechanical DOF. Lagrange’s equations, for example, are usually written using such coordinates. Hence, this approach enables the expression of dynamical equations of motion in terms of measured quantities on the image plane. Obstacles such as the FOV and self-occlusions often appear as the boundary of a compact manifold in image-space and hence their avoidance may be cast as an instance of dynamical obstacle avoidance [5]. Although the framework is quite robust in practice, formal guarantees of its robustness to noise or parametric uncertainty remain an open problem.

1.2  Contribution 

To  date,  global  image-based  representations  of  configuration  have  been  applied  only  to  three  DOF  systems.  This 
paper  builds  on  previous  results  in  a  key  way:  we  present  an  image-based,  geometric  representation  of  six  DOF 
rigid  motion.  Our  development  of  a  global  representation  of  “visible”  rigid  motions  viewed  through  the  projection 
of  a  set  of  features  should  help  pave  the  way  for  new  global,  dynamic  VS  systems. 

Organization. In Section 2, we employ a specific target geometry (a sphere with a few markings) to create a global image-based representation of motion for six DOF VS. Our development includes, in Section 3, a simple, purely image-based representation of the so-called image Jacobian (made possible since, as we show, the image and task spaces are diffeomorphic). In Section 4, we suggest a method for using our diffeomorphism for kinematic or dynamic control, although much work in this direction remains open. Finally, we give some concluding remarks in Section 5.


2  Six  DOF  Diffeomorphism  to  Image-space 

We assume a visual target may be designed to our specifications, so we may explore new image-based representations of rigid motion. In cases in which we have the freedom to design visual targets (for example when designing docking stations for spacecraft, helicopter landing beacons, or visual targets for a factory setting) this approach may lead to novel target designs that ease the control problem. More generally, it is hoped that the insight drawn from taking this approach may enable us to reinterpret target geometries over which we have less design freedom.

Consider  the  problem  of  moving  a  rigid  target  object  in  six  DOF  relative  to  a  perspective  camera.  The  rigid 
target  considered  is  as  follows: 

1.  A spherical body. Consider a spherical body of radius $\rho$. As the body moves away from the camera, its projection gets smaller. Roughly speaking, the position and size of the body’s image encode the position of the center of the body relative to the camera.

2.  A single point on the body. Adding a visible point to the body breaks the visual symmetry, allowing us to resolve two rotational DOF from the location of the feature point on the image.

3.  A  unit  vector  tangent  to  the  body.  The  final  degree  of  freedom  is  resolved  by  considering  the  orientation 
on  the  image  of  a  projected  vector  attached  to  our  feature  point  on  the  body. 

Zhang  and  Ostrowski  [16,  17]  developed  the  idea  of  projecting  a  spherical  body  to  an  image  plane  for  VS  of  a 
blimp  relative  to  a  large  ball.  Using  a  “flat”  image  plane,  the  resulting  image  is  an  ellipse,  which  they  approximate 
as  a  circle  by  assuming  that  a  slice  of  the  spherical  body  parallel  to  the  image  plane  is  projected.  The  present 
paper  builds  on  that  work,  employing  a  more  ‘exact’  diffeomorphism  to  the  image-space,  as  well  as  incorporating 
additional  markings  on  the  body  whose  projection  encodes  rotational  information. 

2.1  Notation  and  Definitions 

At  the  risk  of  burdening  the  reader  with  formalism,  we  present  the  following  definitions  to  enable  a  precise  geometric 
description  of  the  domain  and  range  of  a  camera  viewing  rigid  motions. 

An affine point $p \in \mathbb{A}^3$ has homogeneous coordinates $p = [p_1 \; p_2 \; p_3 \; 1]^T$ with respect to some rigid frame. Note that $T\mathbb{A}^3 = \mathbb{A}^3 \times \mathbb{R}^3$, and that $\mathbb{R}^3$ acts on points to translate them in the usual way, so that if $v = [v_1 \; v_2 \; v_3]^T \in \mathbb{R}^3$ and $p \in \mathbb{A}^3$, then $p + v = [p_1 + v_1 \;\; p_2 + v_2 \;\; p_3 + v_3 \;\; 1]^T \in \mathbb{A}^3$. Two points cannot be “added” together, but if $p, b \in \mathbb{A}^3$ then $v = p - b \in \mathbb{R}^3$ is the vector such that $p = b + v$. Adding the usual metric structure to affine space $\mathbb{A}^3$ yields Euclidean space $\mathbb{E}^3$ where the distance between two points is given by the two-norm of their difference, $\|p - b\|$ (a measure independent of the choice of rigid frame).

Table 1: List of symbols.

Symbol                                                  Description
$o, p, b, \ldots \in \mathbb{E}^3$                      Euclidean points (Roman)
$v, e, \ldots \in \mathbb{R}^3$                         vectors (boldfaced)
$e_1, e_2, e_3 \in \mathbb{R}^3$                        standard basis
$\pi : (\mathbb{E}^3 - \{o_c\}) \to S^2$                image projection model, spherical panoramic camera
$\mathcal{F} = \{o, i, j, k\}$                          rigid coordinate frame, $o \in \mathbb{E}^3$ and $i, j, k \in \mathbb{R}^3$
$\mathcal{F}_c, \mathcal{F}_b$                          camera frame and body frame
$p^b, v^b$                                              point, $p$, and vector, $v$, with respect to $\mathcal{F}_b$
$H \in SE(3)$                                           rigid transformation of $\mathcal{F}_b$ relative to $\mathcal{F}_c$
$R \in SO(3)$                                           rotation effected by $H$, columns $R = [r_1 \; r_2 \; r_3]$
$d \in \mathbb{R}^3$                                    translation effected by $H$
$p^c = H p^b$, $v^c = R v^b$                            point, $p$, and vector, $v$, with respect to $\mathcal{F}_c$
$\nu : SE(3) \to \mathbb{R}$                            measure of feature visibility, (5)
$\mathcal{V} \subset SE(3)$                             set of “visible” configurations, $H \in \mathcal{V} \iff \nu(H) > 0$
$\lambda \in (0, 1)$                                    radius on image sphere of body, (3)
$s \in S^2$                                             unit vector pointing toward body centroid, (3)
$Q \in SO(3)$                                           image-based rotation, columns $Q = [q_1 \; q_2 \; q_3]$, (7)
$(Q, \lambda, s) = c(H)$                                camera map, (8)
$\mathcal{I}$                                           image feature space, $\mathcal{I} \subset SO(3) \times (0, 1) \times S^2$, (9)

A rigid frame, $\mathcal{F}$, is defined by its origin, $o \in \mathbb{E}^3$, and three mutually orthogonal unit vectors, $i, j, k \in \mathbb{R}^3$, that create a right-handed frame. Consider a full perspective (“pinhole”) camera with frame $\mathcal{F}_c$ such that $o_c$ is located at the pinhole (or optical center), with $k_c$ aligned with the optical axis. The pinhole camera projects points in the open half space “in front” of the camera to an image-plane pair via the map $\pi_+ : \{p \in \mathbb{E}^3 : (p - o_c) \cdot k_c > 0\} \to \mathbb{R}^2$, expressed in camera frame coordinates as

$$\pi_+(p) = \frac{f}{p_3} \begin{bmatrix} p_1 \\ p_2 \end{bmatrix}, \qquad p_3 > 0, \tag{1}$$


where $f$ is the camera focal length. The camera observes features of a rigid body, affixed with rigid frame $\mathcal{F}_b$. Let

$$H = \begin{bmatrix} R & d \\ 0^T & 1 \end{bmatrix} \in SE(3), \qquad \text{where} \qquad R = [r_1 \; r_2 \; r_3] \in SO(3), \quad d \in \mathbb{R}^3,$$

denote the rigid transformation of $\mathcal{F}_b$ relative to $\mathcal{F}_c$. A point expressed with respect to the body frame as $p^b$ appears as $p^c = H p^b$ with respect to the camera frame. Similarly, if $v^b$ is a vector in the body frame, then $v^c = R v^b$ is the same vector with respect to the camera frame.

Hamel et al. [6] remap the image plane to a sphere to recover some symmetry that is “broken” by a flat image plane. This approach has also been used in the structure from motion (SFM) literature [1]. Let $\mathbf{p} = (p - o_c)$ and note that the unit vector $\mathbf{p}/\|\mathbf{p}\|$ may be recovered from the image-plane pair in (1) since

$$\frac{\mathbf{p}}{\|\mathbf{p}\|} = \begin{bmatrix} \pi_+(p) \\ f \end{bmatrix} \Big/ \left\| \begin{bmatrix} \pi_+(p) \\ f \end{bmatrix} \right\|, \qquad p_3 > 0,$$

with respect to the camera frame. Of course, this assumes that we know the parameter $f$ (or, more generally, all so-called “intrinsic” camera parameters, omitted to simplify the presentation). Motivated by this observation, we consider for convenience a “panoramic” spherical camera

$$\pi : (\mathbb{E}^3 - \{o_c\}) \to S^2, \qquad \pi : p \mapsto \frac{\mathbf{p}}{\|\mathbf{p}\|}, \quad \text{where } \mathbf{p} = (p - o_c). \tag{2}$$


For the purposes of this paper, $S^2 = \{v \in \mathbb{R}^3 : v \cdot v = 1\} \subset \mathbb{R}^3$. For the camera map, $S^2$ corresponds to the
unit tangent space of $\mathbb{E}^3$ at $o_c$, namely “the set of unit vectors originating from the camera origin.” To keep
features  within  a  finite  FOV,  one  may  introduce  an  appropriate  image-space  “obstacle”  into  the  controller  design 
(see  Section  4). 
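The relationship between the pinhole map (1) and the spherical camera (2) can be checked numerically. The following is a minimal sketch, assuming camera-frame coordinates given as plain 3-tuples; the function names and the sample values of `p` and `f` are illustrative and do not come from the paper.

```python
import math

def project_to_sphere(p):
    """Spherical camera (2): unit vector from the camera origin toward p."""
    n = math.sqrt(sum(c * c for c in p))
    if n == 0.0:
        raise ValueError("the camera origin itself has no projection")
    return tuple(c / n for c in p)

def pinhole(p, f):
    """Pinhole map (1); only defined in front of the camera (p3 > 0)."""
    if p[2] <= 0.0:
        raise ValueError("point is not in front of the camera")
    return (f * p[0] / p[2], f * p[1] / p[2])

def pinhole_to_sphere(uv, f):
    """Recover p/||p|| from an image-plane pair by normalizing [u, v, f]."""
    return project_to_sphere((uv[0], uv[1], f))

p = (0.3, -0.2, 2.0)   # illustrative point in front of the camera
f = 0.8                # illustrative focal length
direct = project_to_sphere(p)
via_plane = pinhole_to_sphere(pinhole(p, f), f)
```

Both routes give the same unit vector whenever $p_3 > 0$, which is the sense in which the spherical model loses nothing relative to a calibrated pinhole camera.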


2.2  Image-based  Translation 

Attach the body frame at the center of the sphere, so that the location of the body relative to the camera origin is given by $o_b - o_c = d$. If $\|d\| > \rho$, i.e. the body remains bounded away from the camera origin, then the surface of the body double covers a topological disc on $S^2$ via the map $\pi$. The edge of the disc, a planar slice of the image-sphere, is a perfect circle of radius

$$\lambda = \frac{\rho}{\|d\|}, \qquad \rho < \|d\| < \infty.$$

(The circle radius, $\lambda$, appears dimensionless because the image-sphere was normalized to unit radius.) The center of the circle on the image-sphere is in the direction of

$$s = \frac{d}{\|d\|}$$

and is readily measurable from the projection of the body.

Let $B := \{d \in \mathbb{R}^3 : \|d\| > \rho\}$ denote the translations of the body origin that keep it a body radius away from the camera. We now have a diffeomorphism (a smooth and smoothly invertible function) from locations of the body to image measurements, $c_1 : B \to (0, 1) \times S^2$, given by

$$c_1 : d \mapsto (\lambda, s). \tag{3}$$

The inverse of $c_1$ is given simply by

$$c_1^{-1}(\lambda, s) = \frac{\rho}{\lambda}\, s. \tag{4}$$
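The maps (3) and (4) are simple enough to state in a few lines of code. This is a minimal sketch, assuming the body radius is passed explicitly as `rho`; the function names and numerical values are illustrative only.

```python
import math

def c1(d, rho):
    """Map a translation d with ||d|| > rho to (lam, s), as in (3)."""
    norm = math.sqrt(sum(x * x for x in d))
    if norm <= rho:
        raise ValueError("camera origin must lie outside the body")
    return rho / norm, tuple(x / norm for x in d)

def c1_inv(lam, s, rho):
    """Inverse map (4): d = (rho / lam) * s."""
    return tuple(rho / lam * x for x in s)

rho = 0.5                 # illustrative body radius
d = (1.0, 2.0, 2.0)       # illustrative translation, ||d|| = 3
lam, s = c1(d, rho)
d_back = c1_inv(lam, s, rho)
```

Here `lam` equals `rho / 3`, and `d_back` reproduces `d` up to rounding, illustrating that no depth information is lost once the body radius is known.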


2.3  Image-based  Rotation 

To break the rotational symmetry of our spherical rigid body, attach a visible feature point, $b$, to its surface, and a unit vector $a$ tangent to the body at that point. For convenience, align the body frame so that its origin coincides with the center of the body, and the unit vector $(b - o_b)/\rho$ lies along the negative $e_3$ axis. Hence, in the body frame $b^b = [0, 0, -\rho, 1]^T$.

As we will show, the projection of $b$ to the image-sphere, $q_1 = \pi(b)$, encodes two rotational degrees of freedom. We encode the final degree of freedom by projecting a unit vector or “arrow”, $a$, tangent to the body at the point $b$. In practice, the vector $a$ may be approximated by two distinguishable points on the surface of the sphere. Again for convenience we assume the vector’s body-fixed representation is simply $a^b = e_2$. Let $\mathbf{b} = b - o_c$ denote the vector from the camera origin to the body point $b$. Recalling that the rotation matrix $R$ has columns $(r_1, r_2, r_3)$, then with respect to the camera frame, we have

$$b^c = \begin{bmatrix} \mathbf{b}^c \\ 1 \end{bmatrix} = H b^b, \quad \text{where } \mathbf{b}^c = d - \rho r_3, \quad \text{and} \quad a^c = R a^b = r_2.$$

Note that $(b - o_b) \cdot a = -\rho\, e_3 \cdot e_2 = 0$.

Figure 1: Projection of a spherical body with a feature point on it to the image-sphere. The image-plane measurement is given by $y = (Q, \lambda, s) = c(H)$.

Some configurations cause the body to occlude the feature point, $b$. This occurs when $(b - o_c) \cdot (o_b - b)$ becomes negative. Hence, we define a “visibility” function [5], $\nu$, and associated “visible set” of rigid transformations, $\mathcal{V}$, by

$$\nu(H) := (d - \rho r_3) \cdot r_3 \qquad \text{and} \qquad \mathcal{V} := \{H \in SE(3) : \nu(H) > 0\}. \tag{5}$$

Note that $\nu(H) > 0 \implies \|d\| > \rho$, i.e. $d \in B = \{d \in \mathbb{R}^3 : \|d\| > \rho\}$.
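The visibility test in (5) reduces to a single inner product. The sketch below assumes the rotation is supplied through its third column `r3` only; the function name and the sample configurations are illustrative.

```python
def visibility(r3, d, rho):
    """Visibility function (5): (d - rho*r3) . r3, with r3 the third column of R."""
    return sum((d[i] - rho * r3[i]) * r3[i] for i in range(3))

rho = 0.5
e3 = (0.0, 0.0, 1.0)

# Body in front of the camera with the feature on the near side: visible.
nu_front = visibility(e3, (0.0, 0.0, 2.0), rho)

# Same orientation with the body behind the camera origin: the feature now
# sits on the far side of the body and is occluded.
nu_back = visibility(e3, (0.0, 0.0, -2.0), rho)
```

Because $r_3$ is a unit vector, the expression simplifies to $d \cdot r_3 - \rho$, which makes the implication $\|d\| > \rho$ immediate.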

The projection of $(b, a) \in T\mathbb{E}^3$ to the image sphere is modeled by

$$T\pi : (b, a) \mapsto (\pi(b), T_b\pi \cdot a) \in TS^2.$$

We are not concerned with the length of the projection of $a$, only the direction. Hence, consider the unit tangent map $T_1\pi$ represented in the camera frame by

$$T_1\pi : (b, a) \mapsto (q_1, q_2), \quad \text{where} \quad q_1 = \frac{\mathbf{b}^c}{\|\mathbf{b}^c\|} = \frac{d - \rho r_3}{\|d - \rho r_3\|}, \qquad q_2 = \frac{\Gamma_{q_1} a^c}{\|\Gamma_{q_1} a^c\|} = \frac{\Gamma_{q_1} r_2}{\|\Gamma_{q_1} r_2\|}, \tag{6}$$

where $\Gamma_{q_1} := (I - q_1 q_1^T)$.

Geometrically, $q_2$ is a unit vector tangent to the image-sphere at the point $q_1$. The unit vectors $q_1$ and $q_2$ are mutually orthogonal. Consider the plane containing the camera origin $o_c$, the point $b$, and the vector $a$. The unit vector

$$q_3 = q_1 \times q_2$$

is normal to that plane. Thus, we define a function $c_2 : \mathcal{V} \to SO(3)$,

$$c_2 : H \mapsto [q_1 \; q_2 \; q_3] = Q, \tag{7}$$

identifying $T_1 S^2$ with $SO(3)$.
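The construction of $Q$ in (6) and (7) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: vectors are plain tuples, only the columns `r2` and `r3` of the rotation are needed, and the helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    if n == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return tuple(x / n for x in a)

def image_rotation(r2, r3, d, rho):
    """Columns (q1, q2, q3) of Q, following (6) and (7)."""
    b = tuple(d[i] - rho * r3[i] for i in range(3))   # camera-to-feature vector
    q1 = unit(b)
    # Apply (I - q1 q1^T) to r2, then normalize.
    proj = tuple(r2[i] - dot(q1, r2) * q1[i] for i in range(3))
    q2 = unit(proj)
    return q1, q2, cross(q1, q2)

# Illustrative configuration: identity rotation, body slightly off-axis.
q1, q2, q3 = image_rotation((0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.2, 0.1, 2.0), 0.5)
```

The three returned vectors are orthonormal by construction, so stacking them as columns gives an element of $SO(3)$.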


2.4  Diffeomorphism  to  Image-space 
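Claim 1. The function $c : \mathcal{V} \to \mathcal{I}$, defined by

$$c(H) := (c_2(H), c_1(d)), \tag{8}$$

where

$$\mathcal{I} = \left\{(Q, \lambda, s) \in SO(3) \times (0, 1) \times S^2 : q_1 \cdot s > \sqrt{1 - \lambda^2}\right\} \quad \text{and} \quad Q = [q_1 \; q_2 \; q_3], \tag{9}$$

is a diffeomorphism, i.e. $\mathcal{V} \simeq \mathcal{I}$.

The proof is given in Appendix A. □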

3  Image  Jacobian 

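To be of practical application to VS we present a representation of the tangent map $Tc : T\mathcal{V} \to T\mathcal{I}$, its inverse $Tc^{-1}$, and the cotangent map $T^*c : T^*\mathcal{I} \to T^*\mathcal{V}$, with the following commutative diagram in mind:

$$T\mathcal{V} \; \underset{Tc^{-1}}{\overset{Tc}{\rightleftarrows}} \; T\mathcal{I}, \qquad\qquad T^*\mathcal{V} \; \xleftarrow{\;\; T^*c \;\;} \; T^*\mathcal{I}.$$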

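We identify the tangent space $TSE(3)$ of the Lie group $SE(3)$ with^1

$$TSE(3) \simeq SE(3) \times \mathfrak{se}(3) \simeq SE(3) \times (\mathbb{R}^3 \oplus \mathbb{R}^3), \tag{10}$$

where $\mathfrak{se}(3)$ is the Lie algebra of $SE(3)$. The identification occurs via “right translation,” i.e.

$$(H, \dot H) \mapsto (H, \dot H H^{-1}) \mapsto (H, (\omega, v)) \tag{11}$$

where

$$H = \begin{bmatrix} R & d \\ 0^T & 1 \end{bmatrix}, \qquad \dot H = \begin{bmatrix} \dot R & \dot d \\ 0^T & 0 \end{bmatrix}, \qquad \omega = (\dot R R^{-1})^\vee, \qquad v = -\dot R R^{-1} d + \dot d,$$

and the isomorphism $\mathbb{R}^3 \simeq \mathfrak{so}(3)$ is defined by

$$\wedge : \begin{bmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{bmatrix} \mapsto \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix} \qquad \text{and} \qquad \vee : \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix} \mapsto \begin{bmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{bmatrix},$$

where $\mathfrak{so}(3)$ is the Lie algebra of $SO(3)$. More detail can be found in, for example, [13].

^1 The Lie algebra $\mathbb{R}^3 \oplus \mathbb{R}^3$ is $\mathbb{R}^3 \times \mathbb{R}^3$ with the Lie bracket structure found in [13].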
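The hat and vee maps between $\mathbb{R}^3$ and skew-symmetric matrices are easy to sketch in code. In the minimal example below, matrices are nested lists in row-major order; the function names `hat` and `vee` and the sample vectors are illustrative.

```python
def hat(w):
    """Map a 3-vector to the skew-symmetric matrix with hat(w) x = w × x."""
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def vee(W):
    """Inverse of hat: read the vector back out of a skew-symmetric matrix."""
    return (W[2][1], W[0][2], W[1][0])

def matvec(M, x):
    return tuple(sum(M[i][j] * x[j] for j in range(3)) for i in range(3))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

w = (1.0, -2.0, 0.5)
x = (0.3, 0.7, -1.1)
lhs = matvec(hat(w), x)   # matrix form
rhs = cross(w, x)         # cross-product form
```

The two results agree, and `vee(hat(w))` returns `w`, which is all the identification requires.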

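Similarly, for each $y = (Q, \lambda, s) \in \mathcal{I} \subset SO(3) \times (0, 1) \times S^2$, we have the following identification

$$T_y\mathcal{I} = T_Q SO(3) \times T_\lambda(0, 1) \times T_s S^2 \simeq \mathbb{R}^3 \times \mathbb{R} \times T_s S^2 \tag{12}$$

where we identify $T_Q SO(3)$ with $\mathfrak{so}(3) \simeq \mathbb{R}^3$, again via right translation

$$(Q, \dot Q) \mapsto (Q, \xi) \quad \text{where} \quad \xi = (\dot Q Q^{-1})^\vee. \tag{13}$$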

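Hence, to compute $T_H c$ we find the mapping relating the tangent space identifications made above in (10) and (12), namely

$$(H, (\omega, v)) \mapsto (y, (\xi, \dot\lambda, \dot s)), \quad \text{where } y = (Q, \lambda, s) = c(H),$$

$$\begin{bmatrix} \xi \\ \dot\lambda \\ \dot s \end{bmatrix} = C(y) \begin{bmatrix} \omega \\ v \end{bmatrix}, \qquad C(y) := T_H c \,\big|_{H = c^{-1}(y)} = \begin{bmatrix} I_{3 \times 3} & \frac{1}{\beta}\left(\delta\, q_1 q_3^T - q_2 q_3^T + q_3 q_2^T\right) \\ 0_{1 \times 3} & -\frac{\lambda^2}{\rho}\, s^T \\ -\hat s & \frac{\lambda}{\rho}\left(I_{3 \times 3} - s s^T\right) \end{bmatrix} \tag{14}$$

where

$$\delta = \frac{s \cdot q_2}{\sqrt{\lambda^2 - \sin^2\phi}}, \qquad \beta = \frac{\rho}{\lambda}\left(\cos\phi - \sqrt{\lambda^2 - \sin^2\phi}\right),$$

$$\cos\phi = s \cdot q_1 \qquad \text{and} \qquad \sin\phi = \sqrt{1 - (s \cdot q_1)^2}.$$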


The construction of $C$ is straightforward. The details are given in Appendix B.

Computing $Tc^{-1}$ and $T^*c$ is now straightforward. Using the above representations, we have

$$T_y c^{-1} = \left(C(y)^T C(y)\right)^{-1} C(y)^T \qquad \text{and} \qquad T_y^* c = C(y)^T. \tag{15}$$

Note that the expression for $T_y c^{-1}$ is not a pseudo-inverse. The possible confusion arises since the six dimensional tangent space $T_y\mathcal{I}$ is locally embedded in $\mathbb{R}^7$. It should be noted that many image-based visual servoing strategies employ the pseudo-inverse of the image Jacobian since the image feature points are treated as though moving freely in $\mathbb{R}^n$.
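One way to read (15) is as a left inverse. Assuming $C(y)$ has full column rank six (as it must if $c$ is a diffeomorphism), the following short check shows that the expression recovers $(\omega, v)$ exactly from any image velocity that is tangent to the image space, i.e. any velocity in the range of $C(y)$:

```latex
\left(C^{T} C\right)^{-1} C^{T} \left( C \begin{bmatrix} \omega \\ v \end{bmatrix} \right)
  = \left(C^{T} C\right)^{-1} \left(C^{T} C\right) \begin{bmatrix} \omega \\ v \end{bmatrix}
  = \begin{bmatrix} \omega \\ v \end{bmatrix},
\qquad C = C(y) \in \mathbb{R}^{7 \times 6}.
```

No least-squares approximation is involved for such velocities, because they are never inconsistent with a rigid motion.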
4  Controller


For the present work, we consider the case of so-called “eye-in-hand” VS, wherein the camera moves relative to the body, which serves as an inertial reference frame. Let $(\Omega, V)$ denote the angular and spatial velocities, respectively, of the camera relative to the fixed, inertial body frame. Let $G = H^{-1}$ denote the transformation of the camera frame, $\mathcal{F}_c$, relative to the inertial body frame, $\mathcal{F}_b$. Note that

$$\begin{bmatrix} \hat\Omega & V \\ 0^T & 0 \end{bmatrix} = G^{-1}\dot G = -\dot H H^{-1} = -\begin{bmatrix} \hat\omega & v \\ 0^T & 0 \end{bmatrix}, \tag{16}$$

effectively mapping the identification of $TSE(3)$ given by the right translation of $\dot H$ in (10) and (11) to the left translation of $\dot G = \frac{d}{dt}(H^{-1})$. Note that this relationship clarifies the kinematic distinction between “eye-in-hand” servoing and the so-called “fixed-camera” configuration, wherein the camera is fixed and the body is moving.
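The sign change in (16) follows from differentiating the identity $H H^{-1} = I$. A short derivation, using only $G = H^{-1}$ and the right-translation identification (11), is:

```latex
\begin{aligned}
0 &= \frac{d}{dt}\left(H H^{-1}\right)
   = \dot H H^{-1} + H \frac{d}{dt}\left(H^{-1}\right)
   = \dot H H^{-1} + G^{-1} \dot G, \\
\Longrightarrow \quad G^{-1} \dot G &= -\dot H H^{-1},
\qquad \text{so} \qquad (\Omega, V) = -(\omega, v).
\end{aligned}
```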

For simplicity, we posit a fully actuated purely kinematic plant model

$$\dot G = G \begin{bmatrix} \hat\Omega & V \\ 0^T & 0 \end{bmatrix} \tag{17}$$

where we treat $(\Omega, V) \in \mathbb{R}^3 \oplus \mathbb{R}^3$ as control inputs. We generalize this to a dynamical free rigid body in Appendix C. One possible control strategy involves planning a path $y_d(t) \in \mathcal{I}$ that moves from the initial configuration to the goal state and following the path via

$$\begin{bmatrix} \Omega \\ V \end{bmatrix} = -T_y c^{-1} \begin{bmatrix} \xi_d \\ \dot\lambda_d \\ \dot s_d \end{bmatrix} \tag{18}$$

where $[\xi_d, \dot\lambda_d, \dot s_d]^T$ is the desired velocity $\dot y_d$, expressed using the tangent space identification in (12). The minus sign in the above expression arises due to the identification made above in (16).


4.1  Visual  Servoing  via  Navigation  Functions 

The diffeomorphism $c$, the visible set $\mathcal{V}$, and its relatively simple image $\mathcal{I}$, provide tremendous leverage into the VS problem. Given a desired configuration $G^* = (H^*)^{-1}$, measured through its image $y^* = (Q^*, \lambda^*, s^*) = c(H^*)$, there are many possible image-based control strategies we can employ to achieve our objective of driving $G \to G^*$.

An open-loop strategy, such as the one above in (18), may be undesirable. However, the generation of $\dot y_d$ can
also  be  conceived  as  a  feedback  law,  for  example  by  using  the  method  of  Navigation  Functions  (NF’s)  [8,  9,  14].  A 
substantial  benefit  of  using  NF’s  is  that  they  allow  us  to  “lift”  our  kinematic  controller  to  second  order  settings  with 
little  additional  effort,  while  maintaining  similar  convergence  guarantees  (as  we  do  for  this  problem  in  Appendix 
C).  Moreover,  these  methods  have  already  proven  practicable  for  dynamic  VS  [5]. 

Let $\mathcal{D} \subset \mathcal{I}$ be a compact “safe” domain. If we carefully design an artificial potential function $\varphi : \mathcal{D} \to [0, 1]$, then by letting

$$\dot y_d = -\nabla\varphi(y) \tag{19}$$

the control law given by (18) drives $G$ so that $y$ converges to $y^*$, except for a set of initial conditions of measure zero. The following definition, adapted from [8], gives a set of conditions that guarantee essentially global convergence of the above controller (18), with $\dot y_d$ given in (19).

Definition 1. Let $\mathcal{D}$ be a smooth compact connected manifold with boundary, and $y^* \in \mathring{\mathcal{D}}$ be a point in its interior. A Morse function, $\varphi \in C^2[\mathcal{D}, [0, 1]]$, is called a Navigation Function if

1.  $\varphi$ takes its unique minimum at $\varphi(y^*) = 0$;

2.  $\varphi$ achieves its maximum of unity uniformly on the boundary, i.e. $\partial\mathcal{D} = \varphi^{-1}(1)$.

For any function satisfying the above definition, the controller given by (18) will ensure convergence $y \to y^*$ as $t \to \infty$ from almost all initial conditions in $\mathcal{D}$. For more information, see [8].

4.2  Computing  a  Safe  Domain  and  Navigation  Function 

The next step is to compute a compact domain $\mathcal{D} \subset \mathcal{I}$ that is “safe” with respect to the FOV of our camera system in the sense that if $G^{-1} = H \in c^{-1}(\mathcal{D})$ then all the necessary features are visible. To illustrate, we treat the FOV as a cone originating at the camera origin, with center along $e_3$, as shown in Figure 2. This cone reduces to a constraint on $s$ and $\lambda$, namely

$$f(y) := \lambda\, s \cdot e_3 - \sqrt{(1 - \lambda^2)\left(1 - (s \cdot e_3)^2\right)} \ge \cos\theta$$

where $\theta$ is the angle from $e_3$ to the edge of the FOV cone. Additionally, we constrain $\lambda \in [\lambda_{\min}, \lambda_{\max}] \subset (0, 1)$ where the parameters $\lambda_{\min}$ and $\lambda_{\max}$ effectively keep the camera from moving too far from or too close to the body, respectively. Finally, we keep $q_1$ from being too close to the edge of the projected circle, namely $q_1 \cdot s \ge \sqrt{1 - \lambda^2} + \varepsilon$ for some small $\varepsilon > 0$. Putting these constraints together yields the compact manifold

$$\mathcal{D} = \left\{ y = (Q, \lambda, s) \in SO(3) \times [\lambda_{\min}, \lambda_{\max}] \times S^2 : f(y) \ge \cos\theta, \;\; \lambda_{\min} \le \lambda \le \lambda_{\max}, \;\; q_1 \cdot s \ge \sqrt{1 - \lambda^2} + \varepsilon \right\} \subset \mathcal{I},$$

where $\theta \in (0, \pi/2)$ and $0 < \lambda_{\min} < \lambda_{\max} < 1$.

Clearly $\mathcal{D} \subset \mathcal{I}$. Given this domain, one must construct an NF on $\mathcal{D}$. The construction of $\varphi$ represents work in progress; however, we conjecture that, given the relatively simple geometry of $\mathcal{D}$, constructing a suitable NF should be straightforward. In fact, we believe (but have not yet formally shown) that $\mathcal{D} \simeq [0, 1]^5 \times S^1$, which is the same topology for which an NF has already been constructed for VS by the first author and colleagues [5].

5  Conclusion 

In  this  paper,  we  presented  a  global  diffeomorphism  from  a  large  subset  of  configurations  in  SE(3)  —  those  that 
are  “visible”  —  to  an  appropriately  defined  image  space.  Such  constructions  provide  tremendous  leverage  because 
they  shed  light  on  the  geometry  of  occlusion  free  servoing  as  well  as  provide  a  clear  pathway  to  construct  global 
dynamical  visual  servoing  systems  by  using,  for  example,  Navigation  Functions. 

A  global,  sensor-based  representation  of  the  configuration  space  leaves  many  open  doors.  For  example,  the 
control  of  underactuated  and  kinematically  nonholonomic  systems  becomes  possible  in  sensor  space.  The  work 
presented in this paper represents only the tip of the iceberg. Now that we know it is possible to globally represent rigid motion using image coordinates, we would like to construct a more general class of diffeomorphisms to the image plane that does not require designing special visual targets. We believe that with proper insight, the projection of a collection of rigidly connected feature points may be interpreted geometrically, again enabling a global representation of visible configurations. For example, perhaps depth can be described in terms of “moments”, as suggested by Hamel and Mahony [6], and orientation can be described in terms of the projection of two or three feature points.
A  Proof That $c : \mathcal{V} \to \mathcal{I}$ is a Diffeomorphism

The proof proceeds in four parts. First, we show that $c$ is smooth on $\mathcal{V}$. Next we show that $c(\mathcal{V}) \subset \mathcal{I}$. Third, we show that $c$ is bijective by explicitly computing its inverse, $c^{-1}$, on $\mathcal{I}$. Finally, we show that $c^{-1}$ is smooth on $\mathcal{I}$.


The  function  c  is  smooth. 

The function $c$ is composed of smooth functions away from the set where the arguments of $\|\cdot\|^{-1}$ become zero. But those arguments are nonzero on $\mathcal{V}$. In particular:

1.  Equation (3) depends on $\|d\|^{-1}$. However, $H \in \mathcal{V}$ implies $\|d\| > \rho$.

2.  Equation (6) depends on $\|d - \rho r_3\|^{-1}$. Visibility implies $\|d\| > \rho$, which in turn implies $\|d - \rho r_3\| \ge \|d\| - \|\rho r_3\| = \|d\| - \rho > 0$.

3.  Equation (6) depends on $\|\Gamma_{q_1} r_2\|^{-1}$. This blows up iff $q_1 = \pm r_2$, i.e.

$$q_1 = \frac{d - \rho r_3}{\|d - \rho r_3\|} = \pm r_2,$$

and hence, from (5),

$$\nu(H) = (d - \rho r_3) \cdot r_3 = \pm \|d - \rho r_3\|\, r_2 \cdot r_3 = 0 \implies H \notin \mathcal{V}.$$

This contradiction implies that $\|\Gamma_{q_1} R a^b\| > 0$ for $H \in \mathcal{V}$.

Hence, $c$ is smooth on $\mathcal{V}$.
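The image of $\mathcal{V}$ is indeed contained in $\mathcal{I}$.

To see that $c(\mathcal{V}) \subset \mathcal{I}$, let $(Q, \lambda, s) = c(H)$. By construction of $c$, we have that $(Q, \lambda, s) \in SO(3) \times (0, 1) \times S^2$. To show that $q_1 \cdot s > \sqrt{1 - \lambda^2}$, note from (6) that

$$q_1 \cdot s = \frac{(d - \rho r_3) \cdot s}{\|d - \rho r_3\|} = \frac{\|d\| - \rho\, s \cdot r_3}{\sqrt{\|d\|^2 - 2\rho\, d \cdot r_3 + \rho^2}} = \frac{1 - \lambda\alpha}{\sqrt{1 - 2\lambda\alpha + \lambda^2}}$$

where $\alpha = s \cdot r_3$. From (5), $\alpha > \lambda$ to ensure $\nu(H) > 0$. Since the right hand side is increasing in $\alpha$ for $\alpha > \lambda$ and reaches its minimum of $\sqrt{1 - \lambda^2}$ at $\alpha = \lambda$, we have that $q_1 \cdot s > \sqrt{1 - \lambda^2}$.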

The  function  c  is  bijective. 

Consider any $H \in \mathcal{V}$, with rotation $R = [r_1 \; r_2 \; r_3]$ and translation $d$ as usual. Let $c(H) = (Q, \lambda, s) \in \mathcal{I}$, where $Q = [q_1 \; q_2 \; q_3]$. Given $(\lambda, s)$, recovering the translation from (4) is trivial, namely $d = c_1^{-1}(\lambda, s) \in B$. Recovering the rotation from $(Q, \lambda, s)$ requires a bit more care.

Consider the triangle defined by $o_c$, $o_b$ and $b$. The points $o_c$ and $o_b = o_c + d$ are known, but $b$ is unknown. Let $\mathbf{b} = b - o_c$ and note that

$$\mathbf{b}^c = \beta q_1, \quad \text{for some } \beta \in \mathbb{R}.$$

Moreover, $b$ lies on the surface of the spherical body of radius $\rho$ centered at $o_b$. Let $\phi$ denote the known angle between $b - o_c$ and $o_b - o_c$, given in the camera frame by

$$\cos\phi = q_1 \cdot s > \sqrt{1 - \lambda^2} > 0 \implies \phi \in [0, \pi/2).$$

From the above we know the lengths of two sides of the triangle $o_c, o_b, b$, namely $\|d\| = \rho/\lambda$ and $\rho$, and one angle, $\phi$. From the law of cosines, we have

$$\beta^2 + \|d\|^2 - 2\beta\|d\|\cos\phi = \rho^2 \implies \beta = \frac{\rho}{\lambda}\left(\cos\phi + \sigma\sqrt{\lambda^2 - \sin^2\phi}\right)$$

where $\sigma = \pm 1$. In general, there may be zero, one or two real solutions for $\beta$. However, the requirement that $s \cdot q_1 > \sqrt{1 - \lambda^2}$ implies that $\phi \in [0, \phi_{\max})$, where $\phi_{\max} = \arccos\sqrt{1 - \lambda^2}$ (where arccos is taken in the first quadrant). This implies $\sin^2\phi < \lambda^2$, and thus there are two algebraic solutions for $\beta$. Note that $r_3 = (d - \mathbf{b})/\rho$, and thus

$$\begin{aligned} \nu(H) &= \mathbf{b} \cdot (d - \mathbf{b})/\rho = (\beta\|d\|\cos\phi - \beta^2)/\rho = (\beta/\lambda)(\cos\phi - \lambda\beta/\rho) \\ &= (\beta/\lambda)\left(\cos\phi - \left(\cos\phi + \sigma\sqrt{\lambda^2 - \sin^2\phi}\right)\right) \\ &= -\sigma\frac{\beta}{\lambda}\sqrt{\lambda^2 - \sin^2\phi}. \end{aligned}$$

It is easy to show that $\beta > 0$ for either choice of $\sigma$; hence, visibility implies $\sigma = -1$, allowing us to uniquely compute

$$\mathbf{b} = \|\mathbf{b}\|\, q_1, \quad \text{where} \quad \|\mathbf{b}\| = \frac{\rho}{\lambda}\left(\cos\phi - \sqrt{\lambda^2 - \sin^2\phi}\right). \tag{21}$$

Thus, $r_3 = (d - \mathbf{b})/\rho$.

From (6), $r_2 \in \operatorname{span}\{q_1, q_2\}$, namely

$$r_2 = \alpha_1 q_1 + \alpha_2 q_2 \tag{22}$$

for some $\alpha_1$ and $\alpha_2$. Note from (6) that $\alpha_2 > 0$. Moreover, $r_3 \cdot r_2 = 0$, hence

$$\underbrace{\begin{bmatrix} r_3^T & 0 & 0 \\ I & -q_1 & -q_2 \end{bmatrix}}_{M,\; 4 \times 5} \underbrace{\begin{bmatrix} r_2 \\ \alpha_1 \\ \alpha_2 \end{bmatrix}}_{v,\; 5 \times 1} = 0 \qquad \text{and} \qquad \|r_2\| = 1, \quad \alpha_2 > 0.$$

The matrix $M$ has a one-dimensional kernel since $q_1, q_2$ are linearly independent, and hence there are two possible solutions to $Mv = 0$ for $r_2, \alpha_1, \alpha_2$ subject to $\|r_2\| = 1$, the ambiguity of which is eliminated since $\alpha_2 > 0$.
Combining  the  above  computations  yields  the  unique  inverse  to  c, 


r i  r2  r3  d 
0  0  0  1 


where 


c  1 :  ( Q ,  A,  s)  H  = 


d=  ^s,  v3  =  A  [  g  —  (  cos  (j)  —  \/ X2  —  sin2 
A  A 


q  i 


(23) 


r3  -  q2 

r  2=  q2 - <7i 

r3qi  , 


r3  ■  q 2 

q2 - <7i 

r  3  •  q  i 


and  r\  =  r2  x  r3  . 


The function $c^{-1}$ is smooth.

Finally, we need only show that $c^{-1}$ is smooth. But $c^{-1}$ is composed of smooth functions. There are two caveats:

1. Equations involving $1/\lambda$. This is fine since $0 < \lambda < 1$.

2. Equation for $r_2$. First, note that $r_3 \cdot q_1 = r_3 \cdot b/\|b\| = \nu(H)/\|b\| > 0$. Also, since $q_2$ and $q_1$ are linearly independent, the denominator can never be zero, so this equation is smooth on $\mathcal{I}$.

Hence $c^{-1}$ is smooth, and $c$ is a diffeomorphism $c : \mathcal{V} \approx \mathcal{I}$. □
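The inverse in (23) can be checked numerically with a round trip. The sketch below is illustrative only: the function names are hypothetical, and the forward map `image_map` is assembled from the definitions used in these appendices ($b = d - \rho r_3$, $q_1 = b/\|b\|$, $q_3 = (q_1 \times r_2)/\|q_1 \times r_2\|$, $q_2 = q_3 \times q_1$, $\lambda = \rho/\|d\|$, $s = d/\|d\|$) rather than copied from the main text. It assumes the configuration is visible.

```python
import numpy as np


def image_map(H, rho):
    """Forward map y = c(H) = (Q, lam, s), assembled from the appendix definitions."""
    R, d = H[:3, :3], H[:3, 3]
    r2, r3 = R[:, 1], R[:, 2]
    b = d - rho * r3
    q1 = b / np.linalg.norm(b)
    q3 = np.cross(q1, r2)
    q3 = q3 / np.linalg.norm(q3)
    q2 = np.cross(q3, q1)
    Q = np.column_stack((q1, q2, q3))
    return Q, rho / np.linalg.norm(d), d / np.linalg.norm(d)


def inverse_image_map(Q, lam, s, rho):
    """Inverse map following (21) and (23): recover H from (Q, lam, s)."""
    q1, q2 = Q[:, 0], Q[:, 1]
    cos_phi = q1 @ s
    disc = lam ** 2 - (1.0 - cos_phi ** 2)
    if disc <= 0.0:
        raise ValueError("need sin^2(phi) < lambda^2")
    b_norm = (rho / lam) * (cos_phi - np.sqrt(disc))  # sigma = -1 root
    d = (rho / lam) * s
    r3 = (d - b_norm * q1) / rho
    delta = (r3 @ q2) / (r3 @ q1)
    r2 = q2 - delta * q1
    r2 = r2 / np.linalg.norm(r2)
    r1 = np.cross(r2, r3)
    H = np.eye(4)
    H[:3, :3] = np.column_stack((r1, r2, r3))
    H[:3, 3] = d
    return H
```

Applying `inverse_image_map` to the output of `image_map` should reproduce the original $H$ up to rounding whenever $\nu(H) > 0$.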


B  Computation  of  the  Image  Jacobian 

We compute the image Jacobian matrix given by (14) algebraically in Section B.1, and then verify this geometrically in Section B.2.


B.1 Algebraic Computation of Image Jacobian

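B.1.1 Computation of $\dot\lambda$.

Recall

\[ \dot d = \omega \times d + v, \tag{24} \]

where $\omega = (\dot R R^{-1})^\vee$. Since $\lambda = \rho/\|d\|$, we have

\[ \dot\lambda = -\frac{\rho}{\|d\|^3}\, d \cdot \dot d = -\frac{\lambda^2}{\rho}\, s \cdot v. \tag{25} \]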


B.1.2 Computation of $\dot s$.

Recall $s = d/\|d\|$. Equation (24) implies

\[ \dot s = \frac{1}{\|d\|}(I - ss^T)\dot d = \omega \times s + \frac{\lambda}{\rho}(I - ss^T)v. \]

B.1.3 Computation of $\xi = (\dot Q Q^{-1})^\vee$.

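For simplicity, we will first compute $Q^{-1}\dot Q$ and then use the property

\[ Sx = (S\hat{x}S^{-1})^\vee \quad \forall S \in SO(3),\ x \in \mathbb{R}^3, \]

to compute

\[ \xi = Q(Q^{-1}\dot Q)^\vee. \tag{27} \]

Proceeding in this manner, we have

\[
Q^{-1}\dot Q = \begin{bmatrix} q_1^T \\ q_2^T \\ q_3^T \end{bmatrix}\begin{bmatrix} \dot q_1 & \dot q_2 & \dot q_3 \end{bmatrix}
= \begin{bmatrix} 0 & q_1 \cdot \dot q_2 & q_1 \cdot \dot q_3 \\ q_2 \cdot \dot q_1 & 0 & q_2 \cdot \dot q_3 \\ q_3 \cdot \dot q_1 & q_3 \cdot \dot q_2 & 0 \end{bmatrix},
\]

which implies

\[ (Q^{-1}\dot Q)^\vee = \begin{bmatrix} -q_2 \cdot \dot q_3 \\ -q_3 \cdot \dot q_1 \\ q_2 \cdot \dot q_1 \end{bmatrix}. \tag{28} \]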


It follows that we need to compute the three quantities $q_2 \cdot \dot q_1$, $q_3 \cdot \dot q_1$, $q_2 \cdot \dot q_3$ in terms of $\omega$ and $v$ in order to get an expression for $\xi$ in terms of $\omega$ and $v$. We remark that we chose $q_2 \cdot \dot q_1$, $q_3 \cdot \dot q_1$, $q_2 \cdot \dot q_3$ rather than the other possible three quantities because the computation involved is relatively simple.

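Let us first compute $\dot q_1$. Recall

\[ q_1 = \frac{d - \rho r_3}{\|b\|}, \]

where $b = d - \rho r_3$. From (24) and $\dot r_3 = \omega \times r_3$, it follows

\[
\begin{aligned}
\dot q_1 &= \frac{1}{\|b\|}(\dot d - \rho\dot r_3) + (d - \rho r_3)\frac{d}{dt}\left(\|b\|^{-1}\right) \\
&= \omega \times q_1 + \frac{1}{\|b\|}v + Aq_1,
\end{aligned} \tag{29}
\]

where $A = \|b\|\frac{d}{dt}\left(\|b\|^{-1}\right)$. From (29), it follows

\[ q_2 \cdot \dot q_1 = q_3 \cdot \omega + \frac{1}{\|b\|}\, q_2 \cdot v, \tag{30} \]

\[ q_3 \cdot \dot q_1 = -q_2 \cdot \omega + \frac{1}{\|b\|}\, q_3 \cdot v. \tag{31} \]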


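We now start to compute $q_2 \cdot \dot q_3$. Recall

\[ r_2 = \frac{q_2 - \delta q_1}{\|q_2 - \delta q_1\|}, \qquad \delta = \frac{r_3 \cdot q_2}{r_3 \cdot q_1}. \tag{32} \]

This implies

\[ \|q_1 \times r_2\| = \frac{1}{\|q_2 - \delta q_1\|} = \frac{1}{\sqrt{1 + \delta^2}}, \qquad q_1 \cdot r_2 = -\frac{\delta}{\sqrt{1 + \delta^2}}, \tag{33} \]

which will be used later in computations. From (32), we have $q_3 = q_1 \times q_2 = (q_1 \times r_2)/\|q_1 \times r_2\|$. It follows

\[ \dot q_3 = \frac{1}{\|q_1 \times r_2\|}(\dot q_1 \times r_2 + q_1 \times \dot r_2) + Bq_3, \tag{34} \]

where $B = \|q_1 \times r_2\|\frac{d}{dt}\left(\|q_1 \times r_2\|^{-1}\right)$. Hence,

\[
\begin{aligned}
q_2 \cdot \dot q_3 &= (1 + \delta^2)^{1/2}\left(r_2 \cdot (q_2 \times \dot q_1) - \dot r_2 \cdot q_3\right) \\
&= (1 + \delta^2)^{1/2}\left(r_2 \cdot (q_2 \times \dot q_1) + r_2 \cdot \dot q_3\right),
\end{aligned} \tag{35}
\]

where in the second equality we used the fact that $r_2 \cdot q_3 = 0$ from (32) implies $\dot r_2 \cdot q_3 = -r_2 \cdot \dot q_3$.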

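Using (29), we have

\[
\begin{aligned}
r_2 \cdot (q_2 \times \dot q_1) &= -(q_1 \cdot r_2)(q_2 \cdot \omega) + \frac{1}{\|b\|}(r_2 \times q_2) \cdot v \\
&= -(q_1 \cdot r_2)(q_2 \cdot \omega) - \frac{\delta}{\|b\|\sqrt{1 + \delta^2}}(q_3 \cdot v),
\end{aligned} \tag{36}
\]

where we used $r_2 \times q_2 = -\delta(1 + \delta^2)^{-1/2}q_3$, which comes from (32) and (33).
From (34), $r_2 \cdot q_3 = 0$ and $\dot r_2 = \omega \times r_2$, we get

\[
\begin{aligned}
r_2 \cdot \dot q_3 &= \frac{1}{\|q_1 \times r_2\|}\, q_1 \cdot \left((\omega \cdot r_2)r_2 - \omega\right) \\
&= \frac{1}{\|q_1 \times r_2\|}\, q_1 \cdot \left(\|q_1 \times r_2\|(q_2 \cdot \omega - \delta q_1 \cdot \omega)r_2 - \omega\right) \\
&= (q_1 \cdot r_2)(q_2 \cdot \omega) - \frac{1}{\sqrt{1 + \delta^2}}(q_1 \cdot \omega),
\end{aligned} \tag{37}
\]

where we used (32) in the second equality and (33) in the third equality.
Plugging (36), (37) and (33) into (35), we get

\[ q_2 \cdot \dot q_3 = -q_1 \cdot \omega - \frac{\delta}{\|b\|}\, q_3 \cdot v, \tag{38} \]

where $\delta$ can be expressed in terms of $y = (Q, \lambda, s)$ as

\[ \delta = \frac{s \cdot q_2}{\sqrt{\lambda^2 - \sin^2\phi}}, \qquad \sin\phi = \sqrt{1 - (s \cdot q_1)^2}. \tag{39} \]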

We are now in a position to compute $\xi$ in (27). From (27) and (28) together with (30), (31), (38), we get

\[ \xi = \omega + \frac{1}{\|b\|}\left(\delta q_1 q_3^T - q_2 q_3^T + q_3 q_2^T\right)v, \tag{40} \]

where $\delta$ is given by (39) or (32), and $\|b\|$ can be expressed in terms of $y = (Q, \lambda, s)$ as follows:

\[ \|b\| = \frac{\rho}{\lambda}\left(\cos\phi - \sqrt{\lambda^2 - \sin^2\phi}\right), \qquad \begin{cases} \cos\phi = s \cdot q_1, \\ \sin\phi = \sqrt{1 - (s \cdot q_1)^2}. \end{cases} \tag{41} \]


B.1.4  Computation  of  the  Jacobian. 

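The tangent map of the map $y = c(H)$ in the following relation

\[ \begin{bmatrix} \xi \\ \dot\lambda \\ \dot s \end{bmatrix} = T_H c \begin{bmatrix} \omega \\ v \end{bmatrix} \]

is written as

\[
T_H c\,\big|_{H = c^{-1}(y)} =
\begin{bmatrix}
I_{3 \times 3} & \frac{1}{\|b\|}\left(\delta q_1 q_3^T - q_2 q_3^T + q_3 q_2^T\right) \\
0_{1 \times 3} & -\frac{\lambda^2}{\rho}s^T \\
-\hat s & \frac{\lambda}{\rho}\left(I_{3 \times 3} - ss^T\right)
\end{bmatrix},
\]

where $\delta$ is given by (39), and $\|b\|$ is given by (41).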
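As an illustration, the $7 \times 6$ matrix above can be assembled directly from $y = (Q, \lambda, s)$. This is a minimal sketch under the formulas (39) and (41); the names `hat` and `image_jacobian` are hypothetical, and the row ordering $(\xi, \dot\lambda, \dot s)$ follows the relation stated above.

```python
import numpy as np


def hat(x):
    """Skew-symmetric matrix such that hat(x) @ y == np.cross(x, y)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])


def image_jacobian(Q, lam, s, rho):
    """Map (omega, v) to (xi, lam_dot, s_dot) at the image point y = (Q, lam, s)."""
    q1, q2, q3 = Q[:, 0], Q[:, 1], Q[:, 2]
    cos_phi = q1 @ s
    root = np.sqrt(lam ** 2 - (1.0 - cos_phi ** 2))   # sqrt(lam^2 - sin^2 phi)
    delta = (s @ q2) / root                            # equation (39)
    b_norm = (rho / lam) * (cos_phi - root)            # equation (41)
    J = np.zeros((7, 6))
    J[0:3, 0:3] = np.eye(3)
    J[0:3, 3:6] = (delta * np.outer(q1, q3)
                   - np.outer(q2, q3)
                   + np.outer(q3, q2)) / b_norm
    J[3, 3:6] = -(lam ** 2 / rho) * s
    J[4:7, 0:3] = -hat(s)
    J[4:7, 3:6] = (lam / rho) * (np.eye(3) - np.outer(s, s))
    return J
```

Two quick sanity checks follow from the structure: a pure rotation ($v = 0$) gives $\xi = \omega$ and $\dot\lambda = 0$, and $s \cdot \dot s = 0$ for any input because $s$ is a unit vector.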


B.2  Geometric  Computation  of  Image  Jacobian 

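Let $p_b$ denote an arbitrary point fixed in the body frame, expressed in the camera frame as $p = Hp_b$. Let $\bar p = p - o_c$ denote the vector from the camera origin to the point $p$. Note that

\[ \dot p = \begin{bmatrix} \dot{\bar p} \\ 0 \end{bmatrix} = \dot H p_b = \dot H H^{-1} H p_b = \dot H H^{-1} p, \qquad \dot H H^{-1} = \begin{bmatrix} \hat\omega & v \\ 0 & 0 \end{bmatrix}; \]

hence, we may write

\[ \dot{\bar p} = \begin{bmatrix} -\hat{\bar p} & I \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix}. \]

Thus, it follows that

\[ \frac{d}{dt}\frac{\bar p}{\|\bar p\|} = \frac{1}{\|\bar p\|}\begin{bmatrix} -\hat{\bar p} & \left(I - \|\bar p\|^{-2}\bar p\bar p^T\right) \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix}. \]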


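B.2.1 Computation of $\dot\lambda$ and $\dot s$.

It follows from above that

\[ \dot\lambda = \begin{bmatrix} 0^T & -\frac{\lambda^2}{\rho}s^T \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix} \quad\text{and}\quad \dot s = \begin{bmatrix} -\hat s & \frac{\lambda}{\rho}\left(I - ss^T\right) \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix}. \tag{42} \]


B.2.2 Computation of $\dot q_1$.

Recalling that $q_1 = b/\|b\|$, we have

\[ \dot q_1 = \begin{bmatrix} -\hat q_1 & \|b\|^{-1}\left(I - q_1 q_1^T\right) \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix}, \tag{43} \]

where $\|b\|$ is given by (21).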
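The rate of a unit vector to a body-fixed point, which underlies the expressions for $\dot s$ and $\dot q_1$, can be checked against a finite difference. The sketch below assumes the point velocity $\omega \times \bar p + v$ implied by (24); the helper names are hypothetical.

```python
import numpy as np


def hat(x):
    """Skew-symmetric matrix such that hat(x) @ y == np.cross(x, y)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])


def unit_vector_rate(p, omega, v):
    """Rate of p/|p| when p moves with velocity omega x p + v."""
    n = np.linalg.norm(p)
    J = np.hstack((-hat(p), np.eye(3) - np.outer(p, p) / n ** 2)) / n
    return J @ np.concatenate((omega, v))


# Central-difference comparison along the straight-line motion p + t * p_dot.
p = np.array([0.4, -0.2, 2.0])
omega = np.array([0.1, 0.3, -0.2])
v = np.array([0.5, -0.1, 0.2])
p_dot = np.cross(omega, p) + v
unit = lambda t: (p + t * p_dot) / np.linalg.norm(p + t * p_dot)
eps = 1e-6
finite_diff = (unit(eps) - unit(-eps)) / (2 * eps)
print(np.max(np.abs(finite_diff - unit_vector_rate(p, omega, v))))
```

The printed discrepancy should be at the level of finite-difference error, which supports the $[-\hat{(\cdot)} \;\; \|\cdot\|^{-1}(I - (\cdot)(\cdot)^T)]$ pattern for points that are fixed in the body.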


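B.2.3 Computation of $\dot q_3$.

Consider the line defined by

\[ \ell = \{b + \alpha r_2,\ \forall \alpha \in \mathbb{R}\}, \]

which, together with the camera origin, determines the plane $\Pi$. Let $l_0 \in \ell$ denote the point on the line closest to the camera origin, and $l = l_0 - o_c$ the corresponding vector from the camera origin, i.e. $l \cdot r_2 = 0$. Hence

\[ q_3 = \frac{l}{\|l\|} \times r_2, \quad\text{where}\quad l = \left(I - r_2 r_2^T\right)b. \]

To compute $\dot q_3$, we simply compute $\frac{d}{dt}(l/\|l\|)$ and $\dot r_2$, and use the product rule. Although it is tempting to simply write

\[ \frac{d}{dt}\frac{l}{\|l\|} = \frac{1}{\|l\|}\begin{bmatrix} -\hat l & \left(I - \|l\|^{-2}ll^T\right) \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix} \quad\text{(wrong!)}, \]

it is important to note that $l$ is not fixed with respect to the body; it also depends on the camera location. In fact, for pure translations parallel to $r_2$, $\dot l = 0$. More generally,

\[ \frac{d}{dt}\frac{l}{\|l\|} = \frac{1}{\|l\|}\begin{bmatrix} -\hat l & q_3 q_3^T \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix}. \]

Noting that $\dot r_2 = -\hat r_2\omega$ yields

\[ \dot q_3 = -r_2 \times \frac{d}{dt}\left(\frac{l}{\|l\|}\right) + \frac{l}{\|l\|} \times \dot r_2 = \frac{1}{\|l\|}\begin{bmatrix} -\widehat{(l \times r_2)} & -\hat r_2 q_3 q_3^T \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix}. \]

Hence

\[ \dot q_3 = \begin{bmatrix} -\hat q_3 & -\|l\|^{-2}\, l q_3^T \end{bmatrix}\begin{bmatrix} \omega \\ v \end{bmatrix}. \tag{44} \]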


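B.2.4 Computation of $\dot q_2$.

Noting that $q_2 = q_3 \times q_1$, we have

\[
\begin{aligned}
\dot q_2 &= \hat q_3\dot q_1 - \hat q_1\dot q_3 \\
&= \hat q_3\left(-\hat q_1\omega + \|b\|^{-1}\left(I - q_1 q_1^T\right)v\right) - \hat q_1\left(-\hat q_3\omega - \|l\|^{-2}\, l q_3^T v\right) \\
&= -\hat q_2\omega + \|b\|^{-1}\left(\hat q_3 - q_2 q_1^T\right)v + \|l\|^{-2}\,\hat q_1 l q_3^T v.
\end{aligned}
\]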


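B.2.5 Computation of $\xi$.

As in Section B.1, we compute

\[ \xi = Q(Q^{-1}\dot Q)^\vee. \]

For convenience, we compute $q_1 \cdot \dot q_2$, $q_1 \cdot \dot q_3$ and $q_2 \cdot \dot q_3$. Note that

\[ \|l\|^2 = \|b\|^2\left(1 - (r_2 \cdot q_1)^2\right), \qquad q_1 \cdot l = \|b\|\left(1 - (r_2 \cdot q_1)^2\right), \qquad\text{and}\qquad q_2 \cdot l = -\|l\|\, r_2 \cdot q_1, \]

and hence

\[
\begin{aligned}
q_1 \cdot \dot q_2 &= -q_1 \cdot (q_2 \times \omega) + \|b\|^{-1}q_1 \cdot (q_3 \times v) = -q_3 \cdot \omega - \|b\|^{-1}q_2 \cdot v, \\
q_1 \cdot \dot q_3 &= -q_1 \cdot (q_3 \times \omega) - \|l\|^{-2}(q_1 \cdot l)(q_3 \cdot v) = q_2 \cdot \omega - \|b\|^{-1}q_3 \cdot v,
\end{aligned}
\]

and, using $r_2$ and $r_3$ found in (23),

\[
\begin{aligned}
q_2 \cdot \dot q_3 &= -q_2 \cdot (q_3 \times \omega) - \|l\|^{-2}(q_2 \cdot l)(q_3 \cdot v) \\
&= -q_1 \cdot \omega - \|b\|^{-1}\frac{r_3 \cdot q_2}{r_3 \cdot q_1}\, q_3 \cdot v \\
&= -q_1 \cdot \omega - \|b\|^{-1}\frac{q_2 \cdot s}{\sqrt{\lambda^2 - \sin^2\phi}}\, q_3 \cdot v.
\end{aligned}
\]

Hence

\[ \xi = \omega + \|b\|^{-1}\left(\frac{q_2 \cdot s}{\sqrt{\lambda^2 - \sin^2\phi}}\, q_1 q_3^T - q_2 q_3^T + q_3 q_2^T\right)v, \]

as before in Section B.1.

C Application to Dynamical Rigid Body Servoing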

For second order systems, $\varphi$ serves as an artificial potential function which generates the force $T^*c\,(d\varphi)^T$. Adding a suitable damping force, $B : T\mathcal{V} \to T^*\mathcal{V}$, yields the controller

\[ F = \begin{bmatrix} F_1 \\ F_2 \end{bmatrix} = T^*c\,(d\varphi)^T - B(\Omega, V), \quad\text{where } y = c(H) = c(G^{-1}), \tag{45} \]


and $F \in T^*_G\mathcal{V} \simeq \mathbb{R}^3 \oplus \mathbb{R}^3$ represents the torque and force applied to the camera. Assuming the rigid body is not subject to any additional external forces (such as an external potential or gravity), or assuming such forces can be cancelled by our controller, we model the system with the standard rigid body dynamical equations

\[ \dot G = G\begin{bmatrix} \hat\Omega & V \\ 0^T & 0 \end{bmatrix}, \qquad \dot\Pi = \Pi \times \Omega + F_1, \qquad \dot P = P \times \Omega + F_2, \tag{46} \]


where $\Pi$, $P$ are the angular and linear momenta, respectively, given by

\[ \Pi = I\Omega, \qquad P = mV, \]

where $m$ is the mass of the camera rigid body system, and $I = I^T > 0$ is the angular inertia matrix.

Note that for our controller, we do not need to know the inertia of the camera, and yet we are still guaranteed asymptotic convergence. In fact, one can show that convergence $H \to H^*$ as $t \to \infty$ is guaranteed from all initial conditions on $T^*\mathcal{D}$ whose energy is less than 1. For more information, see [8].
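The structure of (46) can be sketched in a few lines. This is an illustrative evaluation of the right-hand sides only: the applied torque and force $(F_1, F_2)$ are taken as given inputs rather than computed from the controller (45), and the function names are hypothetical.

```python
import numpy as np


def hat(x):
    """Skew-symmetric matrix such that hat(x) @ y == np.cross(x, y)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])


def rigid_body_rates(G, Pi, P, F1, F2, inertia, mass):
    """Right-hand sides of (46): returns (G_dot, Pi_dot, P_dot)."""
    Omega = np.linalg.solve(inertia, Pi)   # from Pi = I Omega
    V = P / mass                           # from P = m V
    twist = np.zeros((4, 4))
    twist[:3, :3] = hat(Omega)
    twist[:3, 3] = V
    G_dot = G @ twist
    Pi_dot = np.cross(Pi, Omega) + F1
    P_dot = np.cross(P, Omega) + F2
    return G_dot, Pi_dot, P_dot
```

With zero applied torque and force, $\Pi \cdot \dot\Pi = 0$ and $P \cdot \dot P = 0$, so the magnitudes of both body-frame momenta are preserved by these equations.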


C.0.6  If  the  camera  frame  is  not  at  the  center  of  mass. 


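Attach a frame $\mathcal{F}_c$ at the pinhole of the camera, $\mathcal{F}_{cm}$ at the center of mass of the camera + robot system, and $\mathcal{F}_b$ to the center of the object being observed. Here we assume that $\mathcal{F}_b$ is the inertial frame. Let $G \in SE(3)$ be the transformation from $\mathcal{F}_c$ to $\mathcal{F}_b$ and $G_0 \in SE(3)$ be the transformation from $\mathcal{F}_{cm}$ to $\mathcal{F}_c$. Assume that $G_0$ does not change in time. The composition $G_{cm} := GG_0$ is the transformation from $\mathcal{F}_{cm}$ to $\mathcal{F}_b$. The body-fixed velocity of the frame $\mathcal{F}_{cm}$ is given by

\[ G_{cm}^{-1}\dot G_{cm} = (GG_0)^{-1}\frac{d}{dt}(GG_0) = G_0^{-1}\left(G^{-1}\dot G\right)G_0. \tag{47} \]

Let

\[ G_{cm}^{-1}\dot G_{cm} = \begin{bmatrix} \hat\Omega_{cm} & V_{cm} \\ 0 & 0 \end{bmatrix}, \qquad G^{-1}\dot G = \begin{bmatrix} \hat\Omega_c & V_c \\ 0 & 0 \end{bmatrix}, \qquad G_0^{-1} = \begin{bmatrix} B_0 & b_0 \\ 0 & 1 \end{bmatrix}. \tag{48} \]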


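Notice that $G_0^{-1}$ is the transformation from frame $\mathcal{F}_c$ to frame $\mathcal{F}_{cm}$. By (47) and (48), it follows that

\[ (\Omega_{cm}, V_{cm}) = (B_0\Omega_c,\ -B_0\Omega_c \times b_0 + B_0 V_c), \]

that is,

\[ \begin{bmatrix} \Omega_{cm} \\ V_{cm} \end{bmatrix} = \begin{bmatrix} B_0 & 0 \\ \hat b_0 B_0 & B_0 \end{bmatrix}\begin{bmatrix} \Omega_c \\ V_c \end{bmatrix}. \]

Readers with knowledge of Lie groups will readily see that what we computed is the adjoint:

\[ (\Omega_{cm}, V_{cm}) = \mathrm{Ad}_{G_0^{-1}}(\Omega_c, V_c). \]

We now study how torques and forces are transformed. Let $(\tau_c, f_c)$ be the torque and force measured in the frame $\mathcal{F}_c$ and $(\tau_{cm}, f_{cm})$ the torque and force measured in the frame $\mathcal{F}_{cm}$. One can easily see the following relationship:

\[ \begin{bmatrix} \tau_{cm} \\ f_{cm} \end{bmatrix} = \begin{bmatrix} B_0 & 0 \\ \hat b_0 B_0 & B_0 \end{bmatrix}^{-T}\begin{bmatrix} \tau_c \\ f_c \end{bmatrix} = \begin{bmatrix} B_0 & \hat b_0 B_0 \\ 0 & B_0 \end{bmatrix}\begin{bmatrix} \tau_c \\ f_c \end{bmatrix}, \]

which can be compactly written as

\[ (\tau_{cm}, f_{cm}) = \mathrm{Ad}^*_{G_0}(\tau_c, f_c). \]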
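Both transformations can be verified numerically. The sketch below is illustrative and uses hypothetical function names: it builds the $6 \times 6$ velocity transform from $(B_0, b_0)$, forms the torque and force transform as its inverse transpose, and compares the velocity transform with the matrix conjugation in (47).

```python
import numpy as np


def hat(x):
    """Skew-symmetric matrix such that hat(x) @ y == np.cross(x, y)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])


def velocity_transform(B0, b0):
    """6x6 matrix taking (Omega_c, V_c) to (Omega_cm, V_cm)."""
    A = np.zeros((6, 6))
    A[:3, :3] = B0
    A[3:, :3] = hat(b0) @ B0
    A[3:, 3:] = B0
    return A


def wrench_transform(B0, b0):
    """6x6 matrix taking (tau_c, f_c) to (tau_cm, f_cm): inverse transpose of the above."""
    return np.linalg.inv(velocity_transform(B0, b0)).T


# Example data: rotation about z by 0.4 rad and an arbitrary offset.
c, s_ = np.cos(0.4), np.sin(0.4)
B0 = np.array([[c, -s_, 0.0], [s_, c, 0.0], [0.0, 0.0, 1.0]])
b0 = np.array([0.1, -0.3, 0.2])
twist_c = np.array([0.2, -0.1, 0.3, 0.5, 0.4, -0.6])    # (Omega_c, V_c)
wrench_c = np.array([0.7, 0.1, -0.2, 1.0, -0.5, 0.3])   # (tau_c, f_c)
twist_cm = velocity_transform(B0, b0) @ twist_c
wrench_cm = wrench_transform(B0, b0) @ wrench_c
print(wrench_cm @ twist_cm, wrench_c @ twist_c)  # mechanical power in each frame
```

The two printed powers agree, which is the reason the torque and force pair transforms by the inverse transpose of the velocity transform.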

The equations of motion in frame $\mathcal{F}_{cm}$ are given by

\[
\begin{aligned}
\dot\Pi_{cm} &= \Pi_{cm} \times \Omega_{cm} + \tau_{cm}, \\
\dot P_{cm} &= P_{cm} \times \Omega_{cm} + f_{cm},
\end{aligned}
\]

with $\Pi_{cm} = I\Omega_{cm}$ and $P_{cm} = mV_{cm}$, where $I$ is the inertia matrix of the camera + robot system with respect to frame $\mathcal{F}_{cm}$ and $m$ is the mass of the camera + robot system. Here, we assumed that the gravity effect has been cancelled out by control.